Discrimination-Aware Association Rule Mining for Unbiased Data Analytics

Authors

  • Ling Luo
  • Wei Liu
  • Irena Koprinska
  • Fang Chen
Abstract

A discriminatory dataset is one with an undesirable correlation between sensitive attributes and the class label, which often leads to biased decision making in data analytics processes. This paper investigates how to build discrimination-aware models even when the available training set is intrinsically discriminatory with respect to sensitive attributes such as race, gender or personal status. We propose a new classification method called the Discrimination-Aware Association Rule classifier (DAAR), which integrates a new discrimination-aware measure with an association rule mining algorithm. We evaluate the performance of DAAR on three real datasets from different domains and compare it with two non-discrimination-aware classifiers (a standard association rule classification algorithm and the state-of-the-art association rule algorithm SPARCCC), and also with a recently proposed discrimination-aware decision tree method. The results show that DAAR effectively filters out discriminatory rules and decreases discrimination on all datasets, with an insignificant impact on predictive accuracy.
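To make the notion of a discriminatory dataset concrete, the sketch below measures the gap in positive-label rates between a protected group and everyone else. This is a common, generic way to quantify such a correlation; it is not the discrimination-aware measure proposed in the paper, which the abstract does not define. The function name `positive_rate_gap`, the field names, and the toy records are all hypothetical.

```python
from collections import defaultdict


def positive_rate_gap(records, sensitive_key, label_key, positive_label, protected_value):
    """Positive-label rate of the unprotected group minus that of the protected group.

    Illustrative only: a generic disparity score, NOT the paper's DAAR measure.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for row in records:
        group = "protected" if row[sensitive_key] == protected_value else "other"
        counts[group][1] += 1
        if row[label_key] == positive_label:
            counts[group][0] += 1

    def rate(group):
        positives, total = counts[group]
        return positives / total if total else 0.0

    return rate("other") - rate("protected")


# Hypothetical toy data: 1 of 4 "F" records is approved, 3 of 4 "M" records are.
toy = [
    {"gender": "F", "approved": "yes"},
    {"gender": "F", "approved": "no"},
    {"gender": "F", "approved": "no"},
    {"gender": "F", "approved": "no"},
    {"gender": "M", "approved": "yes"},
    {"gender": "M", "approved": "yes"},
    {"gender": "M", "approved": "yes"},
    {"gender": "M", "approved": "no"},
]

gap = positive_rate_gap(toy, "gender", "approved", "yes", "F")
print(gap)  # 0.75 - 0.25 = 0.5
```

A gap near zero suggests the label is distributed similarly across the two groups; a large gap is the kind of correlation that a discrimination-aware classifier tries not to reproduce.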

Similar resources

Association Analysis of Semi-structured Data for Discrimination Discovery in Business

Data mining techniques have taken on a critical role in numerous domains such as consumer analytics, finance, banking, medicine, biology, and astronomy... Recently, data mining techniques have also been applied to discovering illegal discriminatory treatment on the basis of sensitive attributes such as race, color, religion, nationality, gender, age... In this paper, we propose ...

Data sanitization in association rule mining based on impact factor

Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...

Optimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining

Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page, one type of web data, is modeled with a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...

Handling Data Skew in MapReduce Cluster by Using Partition Tuning

The healthcare industry has generated large amounts of data, and analyzing these has emerged as an important problem in recent years. The MapReduce programming model has been successfully used for big data analytics. However, data skew invariably occurs in big data analytics and seriously affects efficiency. To overcome the data skew problem in MapReduce, we have in the past proposed a data pro...

Identification of Best Algorithm in Association Rule Mining Based on Performance

Data mining finds hidden patterns in data sets and associations between those patterns. Association rule mining is one of the important techniques for achieving the objectives of data mining. It is a particularly well-studied field in data mining given its importance as a building block in many data analytics tasks. Many studies have focused on efficiency because the data to be min...


Publication date: 2015